Singling out functional similarities in graph databases
Authors
Abstract
Analysis of protein-protein interactions has been shown to be useful for inferring information about biological variations caused by evolution. All the known protein-protein interactions of a given organism can be modelled as a network, namely the protein-protein interaction (PPI) network of that organism, stored in a graph database. Analysing and comparing the PPI graph databases of different organisms helps to retrieve information about conservation across species. In this work, we contribute to identifying functional similarities among proteins belonging to different organisms. The core of the technique is the computation of a maximum weighted matching of bipartite graphs to compare the neighborhoods of pairs of proteins in different PPI graph databases. The idea is that proteins belonging to different organisms should be matched by looking not only at their own sequence similarity, but also at the similarity of the proteins they “strongly” interact with, either directly or indirectly. Furthermore, the technique can exploit both quantitative and reliability information that may be available about interactions, making the analysis more accurate. We tested the method on the S. cerevisiae and D. melanogaster PPI graph databases, showing its effectiveness in identifying functionally related proteins.
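The neighborhood comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy proteins and similarity values, the `alpha` mixing weight, and the normalisation are all assumptions made for illustration, and the matching is found by brute force, which is only practical for very small neighborhoods. Weights are assumed to be non-negative.

```python
from itertools import permutations


def max_weight_matching(left, right, weight):
    """Brute-force maximum weighted bipartite matching.

    Tries every injective assignment of the smaller side to the larger
    side and keeps the one with the highest total weight. Assumes
    non-negative weights; suitable for tiny inputs only.
    """
    swapped = len(left) > len(right)
    small, large = (right, left) if swapped else (left, right)
    best_total, best_pairs = 0.0, []
    for chosen in permutations(large, len(small)):
        # Always build pairs as (left node, right node).
        pairs = list(zip(chosen, small)) if swapped else list(zip(small, chosen))
        total = sum(weight(u, v) for u, v in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


# Hypothetical sequence-similarity scores between proteins of two organisms.
seq_sim = {
    ("y1", "f1"): 0.9, ("y1", "f2"): 0.2,
    ("y2", "f1"): 0.8, ("y2", "f2"): 0.7,
    ("yA", "fA"): 0.5,
}


def sim(u, v):
    """Similarity lookup; unknown pairs count as 0."""
    return seq_sim.get((u, v), 0.0)


# Hypothetical direct neighborhoods in the two PPI networks.
yeast_nbrs = {"yA": ["y1", "y2"]}
fly_nbrs = {"fA": ["f1", "f2"]}


def neighborhood_score(p, q, alpha=0.5):
    """Mix the proteins' own similarity with their matched-neighbor similarity.

    The linear mix and the normalisation by the larger neighborhood size
    are assumptions of this sketch, not taken from the paper.
    """
    total, _ = max_weight_matching(yeast_nbrs[p], fly_nbrs[q], sim)
    norm = max(len(yeast_nbrs[p]), len(fly_nbrs[q]), 1)
    return alpha * sim(p, q) + (1 - alpha) * total / norm


if __name__ == "__main__":
    print(max_weight_matching(yeast_nbrs["yA"], fly_nbrs["fA"], sim))
    print(neighborhood_score("yA", "fA"))
```

For realistic neighborhood sizes, a polynomial-time assignment algorithm (for example the Hungarian method) would replace the brute-force search, and interaction strength or reliability could be folded into the weight function.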
Similar resources
Document clustering based on ontology and a fuzzy approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...
Functional Screening of Phosphatase-Encoding Genes from Bacterial Sources
Phosphatase (APase) enzymes, including phytases, have broad applications in diagnostic kits, poultry feeds, biofertilizers and plant nutrition. Because of the high levels of sequence diversity among phosphatases, an efficient functional screening method is a crucial requirement for the isolation of the encoding genes. This study reports a functional cloning screening method for the iso...
Graph Hybrid Summarization
One solution for processing and analysing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...
The KEGG databases at GenomeNet
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is the primary database resource of the Japanese GenomeNet service (http://www.genome.ad.jp/) for understanding higher order functional meanings and utilities of the cell or the organism from its genome information. KEGG consists of the PATHWAY database for the computerized knowledge on molecular interaction networks such as pathways and comple...
Graph-theoretic approach to quantum correlations.
Correlations in Bell and noncontextuality inequalities can be expressed as a positive linear combination of probabilities of events. Exclusive events can be represented as adjacent vertices of a graph, so correlations can be associated to a subgraph. We show that the maximum value of the correlations for classical, quantum, and more general theories is the independence number, the Lovász number...
Publication date: 2008